Recognizing Objects and Scenes in News Videos

Authors

  • Muhammet Bastan
  • Pinar Duygulu Sahin
Abstract

We propose a new approach to recognizing objects and scenes in news videos, motivated by the availability of large video collections. The approach treats recognition as the translation of visual elements into words. The correspondences between visual elements and words are learned using methods adapted from statistical machine translation, and are then used to predict words for particular image regions (region naming), to predict words for entire images (auto-annotation), or to associate the automatically generated speech transcript text with the correct video frames (video alignment). Experimental results are presented on the TRECVID 2004 data set, which consists of about 150 hours of news videos associated with manual annotations and speech transcript text. The results show that retrieval performance can be improved by associating visual and textual elements. An extensive analysis of features is also provided, and a method for combining features is proposed.
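
The abstract does not say which translation model is used, so the following is only a minimal sketch of the general idea, assuming an IBM Model 1 style EM procedure over pairs of discrete visual tokens and annotation words. The token names (`blob_sky` and so on), the toy data, and the function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def train_translation_table(pairs, iterations=20):
    """Estimate t(word | visual_token) with IBM Model 1 style EM.

    `pairs` is a list of (visual_tokens, words) tuples, one per image.
    This is an illustrative assumption, not the authors' exact method.
    """
    vocabulary = {w for _, words in pairs for w in words}
    # Start from a uniform distribution over words for every visual token.
    t = defaultdict(lambda: 1.0 / len(vocabulary))
    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, token) co-occurrences
        total = defaultdict(float)  # expected occurrences per token
        for visual_tokens, words in pairs:
            for w in words:
                # E-step: share each word among the tokens of its image.
                norm = sum(t[(w, v)] for v in visual_tokens)
                for v in visual_tokens:
                    frac = t[(w, v)] / norm
                    count[(w, v)] += frac
                    total[v] += frac
        # M-step: renormalize so that t(. | v) sums to one for each token.
        t = defaultdict(
            float, {(w, v): c / total[v] for (w, v), c in count.items()}
        )
    return t


def name_region(t, visual_token, vocabulary):
    """Region naming: the most probable word for a single visual token."""
    return max(vocabulary, key=lambda w: t[(w, visual_token)])


def annotate_image(t, visual_tokens, vocabulary, top_k=2):
    """Auto-annotation: rank words by summed probability over all regions."""
    score = {w: sum(t[(w, v)] for v in visual_tokens) for w in vocabulary}
    return sorted(vocabulary, key=lambda w: score[w], reverse=True)[:top_k]


# Hypothetical toy corpus: each image is a bag of region tokens plus words.
toy_pairs = [
    (["blob_sky", "blob_grass"], ["sky", "grass"]),
    (["blob_sky", "blob_water"], ["sky", "water"]),
    (["blob_grass"], ["grass"]),
]
toy_vocabulary = ["sky", "grass", "water"]
table = train_translation_table(toy_pairs)
```

On this toy corpus, `name_region(table, "blob_water", toy_vocabulary)` selects `water` even though `blob_water` only ever appears together with `blob_sky`, because the other images let EM explain `sky` through `blob_sky`.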

Similar resources

Classification of TV sports news by DCT features using multiple subspace method

This paper proposes a method to automatically classify TV sports news articles using image processing and classification techniques. The classification algorithm is based on a multiple subspace method that provides a sports category with more than one subspace corresponding to its typical scenes. The classification is performed without recognizing any objects in an i...

Online multiple people tracking-by-detection in crowded scenes

Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...

On the usefulness of attention for object recognition

Today’s object recognition systems have become very good at learning and recognizing isolated objects or objects in images with little clutter. However, unsupervised learning and recognition in highly cluttered scenes or in scenes with multiple objects are still problematic. Faced with the same issue, the brain employs selective visual attention to select relevant parts of the image and to seri...

Finding Person X: Correlating Names with Visual Appearances

People as news subjects carry rich semantics in broadcast news video and therefore finding a named person in the video is a major challenge for video retrieval. This task can be achieved by exploiting the multi-modal information in videos, including transcript, video structure, and visual features. We propose a comprehensive approach for finding specific persons in broadcast news videos by expl...

Statistical Background Modeling Based on Velocity and Orientation of Moving Objects

Background modeling is an important step in moving object detection and tracking. In this paper, we propose a new statistical approach in which a sequence of frames is selected according to the velocity and direction of some moving objects, and an initial background is then modeled based on detected changes in gray pixel values. To have used this sequence of frames, no estimator or distribu...

Publication date: 2006